Code for implementing iMAP is designed with four bundles of commands wrapped individually in bash driver scripts for performing bioinformatic analysis of microbiome data (Figure 1). The output is then transformed into data structure suitable for conducting exploratory visualization. A progress report is generated at each major analysis step to summarize the output.
Schematic illustration of the iMAP pipeline.
The first step is to gather all materials needed for implementing the iMAP pipeline smoothly (Table S1).
| Required | Description | Folder | Remarks |
|---|---|---|---|
| Hardware | Computer with multi-core processor: preferably 64-bit. | ||
| Remote Accessory Memory (RAM): 8 GB minimum. | |||
| Storage: Tens of gigabytes for small dataset otherwise a few terabytes | |||
| iMAP pipeline | Bundled scripts for microbiome data analysis | iMAP | Link |
| Raw data | Demultiplexed reads in FASTQ format with primers and barcodes removed | data/references | |
| Sample metadata | A tab-separated file showing sample identifiers, categorical and numeric variables | data/metadata | |
| Mapping file | A file that links sample IDs (1st column) to the names of forward (2nd column) and reverse (3rd column) data files | data/references | |
| Design files | Files that assign samples to a specific variables or other categories | data/references | |
| Software | |||
| sekit | For inspecting rawdata format and simple statistics. | code & ~/bin | Link |
| FASTQC | For creating base call quality score images and statistics. Requires the latest Java Development Kit (JDK). | code & ~/bin | Link |
| bbmap | For trimming poor quality reads | code | Link |
| bbduk.sh | For trimming poor quality reads | code & ~/bin | Link |
| multiqc | For summarizing FASTQc output | $PATH | Link |
| Mothur | For sequence processing and classifying the sequences and preliminary analysis | code & ~/bin | Link |
| Statistical analysis and visualization | |||
| R | For statistical analysis and visualization | $PATH | Link |
| Rstudio | An IDE (integrated development environment) for R | Link | |
| iTOL | For display, annotation and management of phylogenetic trees | Link | |
| Reference 16S rRNA gene alignments | |||
| SILVA (nr) | Reference rRNA alignments | data/references | Link |
| Reference 16S rRNA gene classifiers | |||
| SILVA(no gap) | Degapped using degap.seqs function in Mothur | data/references | Link |
| RDP | Mothur-formatted | data/references | Link |
| Greengenes | Mothur-formatted | data/references | Link |
| EzBioCloud | Mothur-formatted | data/references | Link |
| Custom classifiesr | Any manually built classifiers. Highly recommended when studying a specific group of known microbes. | ||
git clone https://github.com/tmbuza/iMAP.git
cd iMAP
# OR
curl -LOk https://github.com/tmbuza/iMAP/archive/master.zip
unzip master.zip
mv iMAP-master iMAP
rm -rf master.zip
cd iMAP
# OR
wget --no-check-certificate https://github.com/tmbuza/iMAP/archive/master.zip
unzip master.zip
mv iMAP-master iMAP
rm -rf master.zip
cd iMAP
# Mac
bash ./code/requirements/iMAP_requirements_mac_driver.bash
# Linux
bash ./code/requirements/iMAP_requirements_linux_driver.bash
bash ./code/requirements/iMAP_checkFiles_driver.bash
Figure S1: Major folders in the iMAP root directory. Missing files must be found by the above script before proceeding.
This is basically a method where users sequentially run individual or bundle scripts on CLI (Command -Line_Interface) one at a time. We have bundled workflow-specific scripts into a driver to make the analysis easily implemented on CLI by just a single click.
# linux
bash ./code/requirements/iMAP_requirements_linux_driver.bash
# mac OS
bash ./code/requirements/iMAP_requirements_mac_driver.bash
# windows
bash ./code/requirements/iMAP_requirements_windows_driver.bash
Keep going…
bash ./code/requirements/iMAP_checkFiles_driver.bash
bash ./code/preprocessing/iMAP_preprocessing_driver.bash
bash ./code/summarizeFastQC/iMAP_multiqc_driver.bash
bash ./code/mockcommunity/iMAP_mockcommunity_driver.bash
bash ./code/seqprocessing/iMAP_seqprocessing_driver.bash
bash ./code/seqclassification/iMAP_seqclassification_driver.bash
bash ./code/seqerrorrate/iMAP_seqerrorrate_driver.bash
bash ./code/otutaxonomy/iMAP_otutaxonomy_driver.bash
bash ./code/annotation/01_processed_seqs.bash
bash ./code/dataanalysis/iMAP_dataanalysis_demo_driver.bash
The iMAP_driver.bash is the master driver for running all analyses on CLI at once.
bash ./code/linux_iMAP_driver.bash
bash ./code/mac_iMAP_driver.bash
bash ./code/windows_iMAP_driver.bash
# Optionally you can use time tracking driver
bash ./code/linux_time_tracking_driver.bash
bash ./code/mac_time_tracking_driver.bash
bash ./code/windows_time_tracking_driver.bash
Users must create a Portable Batch System (PBS) script that describes cluster resources to be used, parameters for the job and the commands to be executed. The following is a PBS script for running executing iMAP pipeline remotely. Note that you must provide the group allocation name (-A) but this may differ from one system to the other. Google for help just in case.
Individual driver
#!/bin/bash -f
#PBS iMAPtest
#PBS -A group allocation name
#PBS -l nodes=1:ppn=8
#PBS -l walltime=4000:00:00
#PBS -l pmem=20gb
#PBS -j oe
#PBS -o iMAPtest.log
#PBS -m abe
#PBS -M tmb72@psu.edu
cd $PBS_O_WORKDIR
bash ./code/requirements/iMAP_requirements_linux_driver.bash
bash ./code/requirements/iMAP_checkFiles_driver.bash
bash ./code/preprocessing/iMAP_preprocessing_driver.bash
bash ./code/summarizeFastQC/iMAP_multiqc_driver.bash
bash ./code/mockcommunity/iMAP_mockcommunity_driver.bash
bash ./code/seqprocessing/iMAP_seqprocessing_driver.bash
bash ./code/seqclassification/iMAP_seqclassification_driver.bash
bash ./code/seqerrorrate/iMAP_seqerrorrate_driver.bash
bash ./code/otutaxonomy/iMAP_otutaxonomy_driver.bash
bash ./code/annotation/01_processed_seqs.bash
bash ./code/dataanalysis/iMAP_dataanalysis_demo_driver.bash
Batch mode
#!/bin/bash -f
#PBS iMAPtest
#PBS -A group allocation name
#PBS -l nodes=1:ppn=8
#PBS -l walltime=4000:00:00
#PBS -l pmem=20gb
#PBS -j oe
#PBS -o iMAPtest.log
#PBS -m abe
#PBS -M tmb72@psu.edu
cd $PBS_O_WORKDIR
bash code/linux_iMAP_driver.bash
| variable | q_zeros | p_zeros | q_na | p_na | q_inf | p_inf | type | unique |
|---|---|---|---|---|---|---|---|---|
| SampleID | 0 | 0.00 | 0 | 0 | 0 | 0 | character | 360 |
| BarcodeSequence | 0 | 0.00 | 360 | 100 | 0 | 0 | character | 0 |
| ForwardPrimerSequence | 0 | 0.00 | 360 | 100 | 0 | 0 | character | 0 |
| ReversePrimerSequence | 0 | 0.00 | 360 | 100 | 0 | 0 | character | 0 |
| ForwardRead | 0 | 0.00 | 0 | 0 | 0 | 0 | character | 360 |
| ReverseRead | 0 | 0.00 | 0 | 0 | 0 | 0 | character | 360 |
| origGroup | 0 | 0.00 | 0 | 0 | 0 | 0 | character | 360 |
| Sex | 0 | 0.00 | 0 | 0 | 0 | 0 | character | 2 |
| Time | 0 | 0.00 | 0 | 0 | 0 | 0 | character | 2 |
| DayID | 0 | 0.00 | 0 | 0 | 0 | 0 | character | 35 |
| DPW | 12 | 3.33 | 0 | 0 | 0 | 0 | integer | 35 |
| Description | 0 | 0.00 | 0 | 0 | 0 | 0 | character | 1 |
Key: q_zeros: quantity of zeros or missing data; q_na: quantity of NA; q_inf: quantity of infinite values; type: factor, character, integer or numeric..; unique: quantity of unique values. Percentages are indicated by p_.
When reviewing metadata status you may notice that numeric variables are confused with numeric data. For example, variable DPW (days post weaning) on Table x shows 12 missing values (3.33%) which is incorrect. As investigators we know that day 1 was coded zero (0) (12 zeros for 12 mouse) which in descriptive statistics is interpreted as missing data. To correct this we re-coded the samples with unique identifies shown as DayID to distinguish them from integers. However, depending of experiment this kind of variables need to be converted to character during analysis.
First 5 lines of mapping metadata
| SampleID | origGroup | Sex | Time | DayID | DPW |
|---|---|---|---|---|---|
| F3D000 | F3D0 | Female | Early | D000 | 0 |
| F3D001 | F3D1 | Female | Early | D001 | 1 |
| F3D002 | F3D2 | Female | Early | D002 | 2 |
| F3D003 | F3D3 | Female | Early | D003 | 3 |
| F3D005 | F3D5 | Female | Early | D005 | 5 |
Last 5 lines of mapping metadata
| SampleID | origGroup | Sex | Time | DayID | DPW |
|---|---|---|---|---|---|
| M6D148 | M6D148 | Male | Late | D148 | 148 |
| M6D149 | M6D149 | Male | Late | D149 | 149 |
| M6D150 | M6D150 | Male | Late | D150 | 150 |
| M6D175 | M6D175 | Male | Late | D175 | 175 |
| M6D364 | M6D364 | Male | Late | D364 | 364 |
Figure 3. Frequency of experimental variables
Number of forward reads before QC (Original_R1): 3634461 ( 100 % )
Number of reverse reads before QC (Original_R2): 3634461 ( 100 % )
Total number of reads before QC: 7268922
Figure x: ensity and histograms plots showing forward and reverse read length. Dotted line indicates mean value for the specified variable
Grouped by Sex variable
Figure x: Barplots of pre-processed reads grouped by sex variable.
Grouped by Time variable
Figure x: Barplots of pre-processed reads grouped by time variable
Grouped by days-post-weaning (D) variable
Figure x: Barplots of pre-processed reads grouped by time variable
open ./results/multiqc/qc0/multiqc_report.html
open ./results/multiqc/qc0/R1/multiqc_report.html
open ./results/multiqc/qc0/R2/multiqc_report.html
Figure x: Example of report generated by running open ./results/multiqc/qc0/multiqc_report.html on CLI
open ./results/multiqc/qctrim25/multiqc_report.html
open ./results/multiqc/qctrim25/R1/multiqc_report.html
open ./results/multiqc/qctrim25/R2/multiqc_report.html
Figure x: Example of report generated by running open ./results/multiqc/qced/R2/multiqc_report.html on CLI
open ./results/multiqc/qced/multiqc_report.html
open ./results/multiqc/qced/R1/multiqc_report.html
open ./results/multiqc/qced/R2/multiqc_report.html
Figure x: Example of report generated by running open ./results/multiqc/qced/R2/multiqc_report.html on CLI
| variable | q_zeros | p_zeros | q_na | p_na | q_inf | p_inf | type | unique |
|---|---|---|---|---|---|---|---|---|
| SampleID | 0 | 0 | 0 | 0 | 0 | 0 | character | 360 |
| origGroup | 0 | 0 | 0 | 0 | 0 | 0 | character | 360 |
| Sex | 0 | 0 | 0 | 0 | 0 | 0 | character | 2 |
| Time | 0 | 0 | 0 | 0 | 0 | 0 | character | 2 |
| DayID | 0 | 0 | 0 | 0 | 0 | 0 | character | 35 |
| Original_R1 | 0 | 0 | 0 | 0 | 0 | 0 | numeric | 357 |
| TrimQ25_R1 | 0 | 0 | 0 | 0 | 0 | 0 | numeric | 357 |
| NophiX_R1 | 0 | 0 | 0 | 0 | 0 | 0 | numeric | 359 |
| Original_R2 | 0 | 0 | 0 | 0 | 0 | 0 | numeric | 357 |
| TrimQ25_R2 | 0 | 0 | 0 | 0 | 0 | 0 | numeric | 357 |
| NophiX_R2 | 0 | 0 | 0 | 0 | 0 | 0 | numeric | 359 |
Key: q_zeros: quantity of zeros or missing data ; q_na: quantity of NA; q_inf: quantity of infinite values; type: factor, character, integer or numeric..; unique: quantity of unique values. Percentages are indicated by p_.
Figure x: Stacked barplots of pre-processed reads grouped by QC variable
Number of forward reads before QC (Original_R1): 3634461 ( 100 % )
Number of reverse reads before QC (Original_R2): 3634461 ( 100 % )
Number of forward reads after trimming at Q=25 (TrimQ25_R1): 3631940 ( 99.93064 % )
Number of reverse reads after trimming at Q=25 (TrimQ25_R2): 3631940 ( 99.93064 % )
Number of forward reads after phiX removal (NophiX_R1): 3631769 ( 99.92593 % )
Number of reverse reads after phiX removal (NophiX_R2): 3631769 ( 99.92593 % )
| Original_R1 | TrimQ25_R1 | NophiX_R1 | Original_R2 | TrimQ25_R2 | NophiX_R2 | |
|---|---|---|---|---|---|---|
| Min. : 14 | Min. : 14 | Min. : 14 | Min. : 14 | Min. : 14 | Min. : 14 | |
| 1st Qu.: 5368 | 1st Qu.: 5365 | 1st Qu.: 5365 | 1st Qu.: 5368 | 1st Qu.: 5365 | 1st Qu.: 5365 | |
| Median : 8001 | Median : 7996 | Median : 7996 | Median : 8001 | Median : 7996 | Median : 7996 | |
| Mean :10096 | Mean :10089 | Mean :10088 | Mean :10096 | Mean :10089 | Mean :10088 | |
| 3rd Qu.:13636 | 3rd Qu.:13630 | 3rd Qu.:13630 | 3rd Qu.:13636 | 3rd Qu.:13630 | 3rd Qu.:13630 | |
| Max. :40113 | Max. :40077 | Max. :40077 | Max. :40113 | Max. :40077 | Max. :40077 |
| SeqLength | Overlap_Length | MisMatches | |
|---|---|---|---|
| Min. : 23.0 | Min. : 0.0 | Min. : 0.000 | |
| 1st Qu.:252.0 | 1st Qu.:127.0 | 1st Qu.: 0.000 | |
| Median :252.0 | Median :160.0 | Median : 0.000 | |
| Mean :243.7 | Mean :164.6 | Mean : 2.098 | |
| 3rd Qu.:253.0 | 3rd Qu.:237.0 | 3rd Qu.: 1.000 | |
| Max. :502.0 | Max. :254.0 | Max. :119.000 |
| QueryLength | PairwiseAlignmentLength | SimBtwnQuery.Template | |
|---|---|---|---|
| Min. :140.0 | Min. :140.0 | Min. : 60.76 | |
| 1st Qu.:252.0 | 1st Qu.:252.0 | 1st Qu.: 90.87 | |
| Median :253.0 | Median :253.0 | Median : 93.25 | |
| Mean :250.6 | Mean :250.7 | Mean : 93.42 | |
| 3rd Qu.:253.0 | 3rd Qu.:253.0 | 3rd Qu.: 98.02 | |
| Max. :292.0 | Max. :294.0 | Max. :100.00 |
Screened Aligned Denoised NonChimeric
Min. : 8 Min. : 6 Min. : 6 Min. : 6
1st Qu.: 4689 1st Qu.: 4658 1st Qu.: 4658 1st Qu.: 4336
Median : 7052 Median : 7012 Median : 7012 Median : 6590
Mean : 8819 Mean : 8767 Mean : 8767 Mean : 8119
3rd Qu.:11937 3rd Qu.:11870 3rd Qu.:11870 3rd Qu.:10740
Max. :34866 Max. :34590 Max. :34590 Max. :32161
BacteriaOnly NoMock
Min. : 6 Min. : 6
1st Qu.: 4328 1st Qu.: 4328
Median : 6578 Median : 6578
Mean : 8113 Mean : 8113
3rd Qu.:10733 3rd Qu.:10733
Max. :32156 Max. :32156
Figure x: Stacked barplots of processed sequences grouped by processing variable
The opticlust algorithm yielded high quality OTUs based on the high precision metrics below.
| Parameter | Value |
|---|---|
| cutoff | 0.2 |
| Sensitivity | 0.998 |
| Specificity | 0.999 |
| PPV | 0.998 |
| NPV | 0.999 |
| Accuracy | 0.999 |
| MCC | 0.997 |
| F1score | 0.998 |
| FDR | 0.002 |
Figure x: Unfiltered and curated OTU abundance. Visual representation of taxon terms highlight the most abundant taxon based on frequency of being assigned to an OTU or tree nodes. Muribaculaceae is the most frequently assigned family and Muribaculaceae_ge is the most frequent species assigned to most sequences.
OTUs
[1] "Otu001" "Otu002" "Otu003" "Otu004" "Otu005" "Otu006" "Otu007"
[8] "Otu008" "Otu010" "Otu011" "Otu012" "Otu013" "Otu014" "Otu016"
[15] "Otu017" "Otu019" "Otu020" "Otu021" "Otu022" "Otu023" "Otu024"
[22] "Otu026" "Otu028" "Otu029" "Otu031" "Otu032" "Otu037" "Otu038"
[29] "Otu041" "Otu045"
Phylum
[1] "Bacteroidetes" "Firmicutes"
Class
[1] "Bacilli" "Bacteroidia" "Clostridia"
[4] "Erysipelotrichia"
Order
[1] "Anaeroplasmatales" "Bacteroidales" "Bifidobacteriales"
[4] "Clostridiales" "Erysipelotrichales" "Lactobacillales"
[7] "Mollicutes_RF39" "Verrucomicrobiales"
Family
[1] "Akkermansiaceae" "Bacteroidaceae"
[3] "Bifidobacteriaceae" "Clostridiaceae_1"
[5] "Clostridiales_vadinBB60_group" "Erysipelotrichaceae"
[7] "Lachnospiraceae" "Lactobacillaceae"
[9] "Mollicutes_RF39_fa" "Muribaculaceae"
[11] "Peptococcaceae" "Rikenellaceae"
[13] "Ruminococcaceae"
Genus
[1] "Acetatifactor" "Akkermansia"
[3] "Alistipes" "Anaerotruncus"
[5] "Bacteroides" "Bifidobacterium"
[7] "Candidatus_Arthromitus" "Clostridiales_vadinBB60_group_ge"
[9] "Lachnoclostridium" "Lachnospiraceae_NK4A136_group"
[11] "Lachnospiraceae_UCG.001" "Lactobacillus"
[13] "Mollicutes_RF39_ge" "Muribaculaceae_ge"
[15] "Oscillibacter" "Roseburia"
[17] "Ruminiclostridium" "Ruminiclostridium_5"
[19] "Ruminiclostridium_9" "Ruminococcaceae_ge"
[21] "Turicibacter"
Figure x. Rank abundance of eight selected samples. Package: goeveg
Figure x. Correlation between species identified at phylum-level. Species are ordered alphabetically (top panel) and heuristically (bottom panel)
Figure x: Stacked barplots for species richness. The estimated richness (green bars) was calculated using chao calculator and observed ichness (red bars) was calculated using sobs.
Figure x: Species richness (observed species) displayed by boxplot (A), density plots (B) and histograms (C).
Figure x: Correlation between species richness and sequence depth. Observed species calculated using sobs (A) and estimated species richness by chao calculator (B).
Figure x: Species diversity and correlation to species richness. Definitely phylo-diversity (C) correlates well with the species richness.
Figure x: Species diversity estimates as a function of sample size. Only species with abundance greater or equal to 1 are detected in the sample.
Figure x. Rarefaction and extrapolation curves. Sample-size-based curve (A), sample completeness curve (B), Coverage-based curves (C).
OTUbased
Number_clusters Value_Index
2.0000 22.3718
Phylum
Number_clusters Value_Index
2.0000 114.5654
Class
Number_clusters Value_Index
2.0000 73.9034
Order
Number_clusters Value_Index
2.00 42.17
Family
Number_clusters Value_Index
2.000 31.017
Genus
Number_clusters Value_Index
2.000 24.511
Figure x: Optimal number of OTU clusters. The suggested number of best clusters (dotted line) that could explain most variation is 2 for OTUs (A), 3 for phylum (B), 3 for class (C), 2 for Order (D), 10 for Family (E) and 2 for Genus (F). A high average Silhouette width indicates high quality clustering.
cluster size ave.sil.width
1 1 232 0.67
2 2 128 0.25
cluster size ave.sil.width
1 1 240 0.68
2 2 120 0.24
cluster size ave.sil.width
1 1 239 0.67
2 2 121 0.22
cluster size ave.sil.width
1 1 239 0.67
2 2 121 0.22
cluster size ave.sil.width
1 1 229 0.67
2 2 131 0.19
cluster size ave.sil.width
1 1 241 0.68
2 2 119 0.29
Figure x: Silhouette plot guided by the best number of clusters. Observations with a large Si (almost 1) are very well clustered. A small Si (around 0) means that the observation lies between two clusters while a negative Si are probably placed in the wrong cluster.
Example: Early vs Late DPW
Index numbers for Female early & late DPW
[1] 1 2 3 4 5 6 7 8 9 10 233 234 235 236 237 238 239
[18] 240 241 242
Index numbers for Male early & late DPW
[1] 19 20 21 22 23 24 25 26 27 28 253 254 255 256 257 258 259
[18] 260 261 262
cluster size ave.sil.width
1 1 31 0.72
2 2 9 0.48
cluster size ave.sil.width
1 1 31 0.69
2 2 9 0.54
cluster size ave.sil.width
1 1 31 0.69
2 2 9 0.52
cluster size ave.sil.width
1 1 31 0.69
2 2 9 0.52
cluster size ave.sil.width
1 1 31 0.70
2 2 9 0.46
cluster size ave.sil.width
1 1 32 0.71
2 2 8 0.52
Figure x: Scree plot of PCA. Shows which components explain most of the variability in the data. Over 80% of the variances contained in OTU and taxonomy data are retained by the first two principal components. The first PC explains the maximum amount of variation in the data set.
Figure x: Principal coordinate analysis ordination using Bray-Curtis dissimilarity matrix. Objects that are ordinated closer together have smaller dissimilarity values than those ordinated further apart. A successful PCoA will capture most of the variation in the (dis)similarity matrix in a few PCoA axes.
OTUs
----------------------------
Call:
metaMDS(comm = otu.t[, -1], distance = "bray", k = 3, try = 10, display = c("sites"), choices = c(1, 2), type = "t", shrink = FALSE)
global Multidimensional Scaling using monoMDS
Data: wisconsin(sqrt(otu.t[, -1]))
Distance: bray
Dimensions: 3
Stress: 0.1182664
Stress type 1, weak ties
No convergent solutions - best solution after 20 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on 'wisconsin(sqrt(otu.t[, -1]))'
Phylum
----------------------------
Call:
metaMDS(comm = phylum.t[, -1], distance = "bray", k = 3, try = 10, display = c("sites"), choices = c(1, 2), type = "t", shrink = FALSE)
global Multidimensional Scaling using monoMDS
Data: wisconsin(sqrt(phylum.t[, -1]))
Distance: bray
Dimensions: 3
Stress: 0.1201002
Stress type 1, weak ties
No convergent solutions - best solution after 20 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on 'wisconsin(sqrt(phylum.t[, -1]))'
Class
----------------------------
Call:
metaMDS(comm = class.t[, -1], distance = "bray", k = 3, try = 10, display = c("sites"), choices = c(1, 2), type = "t", shrink = FALSE)
global Multidimensional Scaling using monoMDS
Data: wisconsin(sqrt(class.t[, -1]))
Distance: bray
Dimensions: 3
Stress: 0.1402302
Stress type 1, weak ties
No convergent solutions - best solution after 20 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on 'wisconsin(sqrt(class.t[, -1]))'
Order
----------------------------
Call:
metaMDS(comm = order.t[, -1], distance = "bray", k = 3, try = 10, display = c("sites"), choices = c(1, 2), type = "t", shrink = FALSE)
global Multidimensional Scaling using monoMDS
Data: wisconsin(sqrt(order.t[, -1]))
Distance: bray
Dimensions: 3
Stress: 0.143402
Stress type 1, weak ties
No convergent solutions - best solution after 20 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on 'wisconsin(sqrt(order.t[, -1]))'
Family
----------------------------
Call:
metaMDS(comm = family.t[, -1], distance = "bray", k = 3, try = 10, display = c("sites"), choices = c(1, 2), type = "t", shrink = FALSE)
global Multidimensional Scaling using monoMDS
Data: wisconsin(sqrt(family.t[, -1]))
Distance: bray
Dimensions: 3
Stress: 0.1343065
Stress type 1, weak ties
No convergent solutions - best solution after 20 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on 'wisconsin(sqrt(family.t[, -1]))'
Genus
----------------------------
Call:
metaMDS(comm = genus.t[, -1], distance = "bray", k = 3, try = 10, display = c("sites"), choices = c(1, 2), type = "t", shrink = FALSE)
global Multidimensional Scaling using monoMDS
Data: wisconsin(sqrt(genus.t[, -1]))
Distance: bray
Dimensions: 3
Stress: 0.0001578126
Stress type 1, weak ties
Two convergent solutions found after 10 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on 'wisconsin(sqrt(genus.t[, -1]))'
Figure X. Sherperd and non-metric multidimensional scaling plot. Green oints represent samples and red points represent OTU or species. Similar samples are ordinated together. Stress values are shown at the botthom of ordination plot.
Figure x: Sample Phylip or Newick-formatted tree clustered using the UPGMA (Unweighted Pair Group Method with Arithmetic Mean) algorithm. Similar data was used to construct different types of tree including rectangular (A), circular (B) and unrooted (C) to view how samples were clustered.